Abstract - Estimating remaining targets after some attempt has been made to detect an overall, unknown number of targets is critical to determining the potential threat associated with these remaining targets. This paper presents a Bayesian approach to calculate the distribution on the number of remaining targets given the sensor performance and the number of targets detected. For a single sensor, a closed form posterior distribution on remaining targets is derived. For multiple sensors, the corresponding posterior distribution is developed. A naive implementation of this calculation is shown to be computationally prohibitive, and an efficient means for performing the calculation is presented.

Keywords:  sensor  management,  Dirichlet-Multinomial 
hierarchical  model. 

1  Introduction 

In Mine Countermeasure operations, sensors with an associated probability of detection attempt to find an unknown number of mines on the seafloor. After these operations, some estimation must be made of the number of remaining mines in order to predict the remaining threat in the area. The combined efforts of multiple sensors allocated to sub-areas, each having an associated probability of detection, must also be considered.

An estimation of the remaining targets after a certain level of search is required to evaluate the potential threat associated with these remaining targets. We consider the problem of estimating an unknown number of stationary targets based on the number of targets detected by a given sensor and the probability of detection associated with this sensor. The problem of determining an unknown number of targets is then considered for a larger area where many sensors are working independently in several sub-areas of the overall area. The quantity of interest is the total number of unknown targets in the entire area. Given uniform sensor performance, the algorithm provides the same results for a single sensor working in an area, or multiple sensors working in several sub-areas of this original area, when the total number of targets detected is the same.

A standard application of Bayes' theorem is to estimate the unknown success probability p of the Binomial distribution for a population of a fixed size given a certain number of observed successes. Looking at the Binomial distribution from another perspective, an estimation can also be made of the population size based on a given success probability and number of observed successes. In general, the problem of estimating the number of trials n for the Binomial distribution has received little attention [2].

The distribution on remaining targets is developed in Section 2. For the single sensor case, assuming an infinite uniform distribution on the total number of targets, the posterior distribution is shown to follow a Negative Binomial distribution. The posterior distribution on remaining targets is then developed for multiple sensors. The prior distribution should be chosen in a way to give the same results as the single sensor case under certain circumstances. This requires the introduction of a hierarchical model to provide a flexible prior distribution giving intuitive results. Section 3 sets up the calculations of interest as expected values. The separable form of these expectation calculations is then exploited in Section 4, giving an efficient way of calculating the expected values for the quantities of interest.

2  Distribution  on  targets  remaining

A Bayesian approach is used to determine a distribution on targets remaining in the area. Before developing this distribution, we review the basic definitions in Bayesian inference [4, 5]. We would like to make statements about an unknown parameter $\theta$ given data $y$. The joint probability distribution $P(\theta, y)$ follows from the definition of conditional probability:

$$P(\theta, y) = P(\theta)\,P(y \mid \theta), \qquad (1)$$

where $P(\theta)$ is referred to as the prior distribution and $P(y \mid \theta)$ as the sampling distribution, or the likelihood function. Bayes' rule follows from another application of the definition of conditional probability:


$$P(\theta \mid y) = \frac{P(\theta, y)}{P(y)} = \frac{P(\theta)\,P(y \mid \theta)}{P(y)}, \qquad (2)$$

where $P(y) = \sum_{\theta} P(\theta)\,P(y \mid \theta)$ sums over all possible values of $\theta$. The symbol "$\propto$", proportional to, will be used for the unnormalized posterior density

$$P(\theta \mid y) \propto P(\theta)\,P(y \mid \theta), \qquad (3)$$

as considering terms up to a constant of normalization can be beneficial. When the unnormalized posterior density is not a recognizable form (and therefore the constant is not known), the constant of normalization can be computed by summing (or integrating in the case $\theta$ is continuous) the unnormalized posterior density over the entire sample space, $P(y) = \sum_{\theta} P(\theta)\,P(y \mid \theta)$.

2.1  Single  Sensor:  the  Posterior  Distribution

Consider the game where a fair coin is flipped an unknown number of times and the resulting number of heads is five. Intuition would suggest that the coin was flipped about ten times, but some variation on this would be expected. Observing only the number of heads from an unknown number of coin flips is the same problem as having observed a certain number of targets from an unknown number of total targets. Assuming a probability of detection $p$ for a given sensor which has detected $m$ targets, the number of detected targets follows a Binomial($n$, $p$) distribution, where $n = m + r$ is the total number of targets and $r$ represents the number of remaining targets.

In order to derive a closed form posterior distribution, an improper (infinite), uniform prior distribution on the number of targets in the area is chosen, $n \sim \text{DiscreteUniform}(0, \infty)$. This can be thought of as $\lim_{N \to \infty} \frac{1}{N+1}$ for $n = 0, 1, 2, \dots$. Computationally, DiscreteUniform($0$, $N$) for some large $N$ would give similar results as the improper prior. In either case, as $N$ does not depend on $n$, this term can be considered as part of the constant of normalization. Using (3) to estimate the unknown $n$ based on observed $m$, the posterior distribution for $n$ given $m$ is:

$$
\begin{aligned}
P(n \mid m) &\propto P(n)\,P(m \mid n) && (4)\\
&\propto P(m \mid n) && (5)\\
&\propto \binom{n}{m} p^{m} (1-p)^{n-m} && (6)\\
&\propto \binom{n}{m} p^{m+1} (1-p)^{n-m}, && (7)
\end{aligned}
$$

where (7) follows from the fact that $p$ is a constant with respect to $n$. This is the kernel of the NegativeBinomial($m+1$, $p$) for $n = m, m+1, \dots$, and putting this into the context of remaining targets $r = n - m$, $r \sim \text{NegativeBinomial}(m+1, p)$ for $r = 0, 1, 2, \dots$.

Thus, assuming the improper uniform prior, given the observation of five heads in an unknown number of coin flips of a fair coin, the number of tails follows a NegativeBinomial(6, 0.5) distribution.
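The single-sensor result can be checked numerically. The following is a minimal sketch, not the authors' implementation: it normalizes the Binomial likelihood directly under a truncated DiscreteUniform(0, n_max) prior and compares the result with the NegativeBinomial(m + 1, p) mass function. The function names and the truncation point n_max = 200 are assumptions made for illustration.

```python
from math import comb

def posterior_total(m, p, n_max):
    """Posterior P(n | m) under a DiscreteUniform(0, n_max) prior."""
    # The uniform prior is constant in n, so only the Binomial likelihood
    # of m detections out of n targets contributes to the weights.
    weights = [comb(n, m) * p**m * (1 - p)**(n - m) if n >= m else 0.0
               for n in range(n_max + 1)]
    z = sum(weights)
    return [w / z for w in weights]

def neg_binomial_pmf(r, m, p):
    """NegativeBinomial(m + 1, p) mass at r remaining targets."""
    return comb(m + r, r) * p**(m + 1) * (1 - p)**r

# Five heads from a fair coin; n_max is an assumed truncation of the improper prior.
m, p, n_max = 5, 0.5, 200
post = posterior_total(m, p, n_max)
for r in range(4):
    print(r, post[m + r], neg_binomial_pmf(r, m, p))

# Posterior mean of the number of remaining targets (tails).
expected_remaining = sum((n - m) * w for n, w in enumerate(post))
print(expected_remaining)
```

For the coin example the posterior mean of the number of tails equals the NegativeBinomial mean (m + 1)(1 - p)/p, so the expected number of total flips is slightly above the intuitive value of ten.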

2.2  Multiple  Sensors 

Assume now that for each sensor $i \in \{1, \dots, T\}$ working in some sub-area, there is an associated number of detected targets $m_i$ and probability of detection $p_i$. To give some intuition to the problem, suppose we have two weighted coins, coin one with associated probability $p_1$ of landing on heads, and coin two with probability $p_2$. The same game can be constructed as in 2.1. For example, given $p_1 = 0.25$ and $p_2 = 0.75$ where we observe five heads from coin one and none from coin two, we can ask for the total number of flips from both coins, as well as the likely number of flips from coin one and coin two.

In the context of this example, the combinatorial nature of the sample space is apparent. In the case of a single coin, we were concerned only with the number of flips. In the case of two coins, we have several ways that ten flips may have occurred: ten flips of coin one, nine of coin one and one of coin two, and so on. In fact, for $k$ coins and $n$ flips, there are $\binom{n+k-1}{k-1}$ ways this may have been observed [1].
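The count of allocations can be confirmed by brute force for small cases. This is an illustrative sketch only; `count_allocations` is a hypothetical helper that enumerates every split of n flips among k coins and compares the count with the binomial coefficient.

```python
from itertools import product
from math import comb

def count_allocations(n, k):
    """Count the ways n flips can be split among k coins by enumeration."""
    return sum(1 for alloc in product(range(n + 1), repeat=k)
               if sum(alloc) == n)

# Compare the enumeration with the closed-form count for a few small cases.
for n, k in [(10, 2), (4, 3), (5, 4)]:
    print(n, k, count_allocations(n, k), comb(n + k - 1, k - 1))
```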

For multiple sensors, we require that the estimation of remaining targets be the same as in 2.1 when the probability of detection and total targets detected do not change (i.e. the resulting overall distribution for one sensor with $p = 0.8$ and $m = 10$ is the same as for 10 sensors, each with $p_i = 0.8$ and $m_i = 1$). This assumption means that the total number of targets needs to be viewed as a single entity. For example, a mine field can be viewed not as a collection of individual mines, but as a single deterrent with a common goal of blocking transit. This assumption plays an important role in the derivation of the posterior distribution.

The derivation of the posterior distribution is divided into the calculation of the likelihood function (Section 2.3) and of the prior distribution (Section 2.4). The resulting posterior distribution is then given in 2.5.

2.3  The  Likelihood  Function 

The likelihood function is a straightforward generalization of the single sensor case. Given the number of targets detected in each of the $T$ sensors' sub-areas $\mathbf{m} = (m_1, \dots, m_T)$ and the probabilities of detection for each sensor $\mathbf{p} = (p_1, \dots, p_T)$, we determine the likelihood function for the total targets in the areas, $\mathbf{n} = (n_1, \dots, n_T)$. Each $p_i$ is the probability of detecting any given target in the sub-area. Then for each sub-area, we determine the likelihood of having observed $m_i$ targets given that $n_i$ targets are actually in the sub-area. As the probability depends only on the sub-area, information in each sub-area is independent.

Thus, the likelihood of the data $\mathbf{m}$ given $\mathbf{n}$ and $\mathbf{p}$ is given by:

$$P(\mathbf{m} \mid \mathbf{n}, \mathbf{p}) = \prod_{i=1}^{T} \binom{n_i}{m_i} p_i^{m_i} (1 - p_i)^{n_i - m_i}. \qquad (8)$$
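As an illustration of the product form of the likelihood, the sketch below evaluates it for the two-coin example of Section 2.2. The function name `likelihood` and the candidate totals are assumptions chosen for the example.

```python
from math import comb, prod

def likelihood(m, n, p):
    """Product of independent Binomial terms, one per sub-area."""
    # A sub-area cannot yield more detections than it contains targets.
    if any(mi > ni for mi, ni in zip(m, n)):
        return 0.0
    return prod(comb(ni, mi) * pi**mi * (1 - pi)**(ni - mi)
                for mi, ni, pi in zip(m, n, p))

# Two weighted coins: five heads from coin one, none from coin two.
p = (0.25, 0.75)
m = (5, 0)
like_a = likelihood(m, (10, 0), p)  # ten flips of coin one, none of coin two
like_b = likelihood(m, (10, 2), p)  # two additional, all-tails flips of coin two
print(like_a, like_b)
```

Each extra undetected flip of coin two multiplies the likelihood by 1 - 0.75, which is why candidate totals with many flips of the high-probability coin are quickly penalized.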

2.4  The  Prior  Distribution 

In order to obtain the same result as in 2.1, the same infinite uniform prior on the total number of targets $n = \sum_{i=1}^{T} n_i$ must be assumed. This is not the same as assuming a uniform prior distribution on the entire sample space. For example, consider the idea of flipping two fair coins a number of times and observing only information about the number of heads observed from each coin. Our prior distribution must be defined for $n = 0, 1, 2, \dots$, i.e. for zero total flips, one total flip, two total flips, etc. From two coins, there is one way to have no flips, two ways to have one flip, three ways to have two flips, and $n + 1$ ways to have $n$ flips. If all of these outcomes have the same prior weight, more weight is assigned to three total flips than to two total flips because there are more ways that more flips could be realized. For $k$ coins the situation would be even more noticeable as there are $\binom{n+k-1}{k-1}$ ways of realizing $n$ flips.

From this discussion it becomes apparent that the prior distribution must be defined on $\mathbf{n} = (n_1, \dots, n_T)$ and $n = \sum_{i=1}^{T} n_i$ must follow a uniform distribution to ensure the same results as 2.1 in the case where sensor performance and total targets detected remain constant. The prior distribution is now expressed as the product of the distribution on total targets and the distribution of the targets between sub-areas given the total number of targets. As $P(n)$ does not depend on $n$ and is therefore part of the constant of normalization, we then see that the choice of prior is really a choice of the distribution of targets among sub-areas $\mathbf{n} = (n_1, \dots, n_T)$ for a fixed number of total targets $n = \sum_{i=1}^{T} n_i$:

$$
\begin{aligned}
P(\mathbf{n}) &= P(n)\,P(\mathbf{n} \mid n) && (9)\\
&\propto P(\mathbf{n} \mid n). && (10)
\end{aligned}
$$

The first choice of prior on the $\mathbf{n} = (n_1, \dots, n_T)$ for a fixed $n = \sum_{i=1}^{T} n_i$ is the Multinomial Distribution (2.4.1). The posterior distribution resulting from this choice is shown to depend on the observed $m_i$ only via their sum $M = \sum_{i=1}^{T} m_i$. A hierarchical model is then considered for $P(\mathbf{n} \mid n)$ (2.4.3).

2.4.1  Multinomial  Distribution 

To determine the prior distribution we require a distribution on the $\mathbf{n} = (n_1, \dots, n_T)$ for a fixed $n = \sum_{i=1}^{T} n_i$. Given $\mathbf{x} = (x_1, \dots, x_T)$ where each $x_i$ represents an assumed a priori probability that any given target is in the sub-area of sensor $i$, a first approach is to consider a Multinomial($\mathbf{x}$, $n$) distribution with pdf

$$f(\mathbf{n} \mid \mathbf{x}, n) = \frac{n!}{\prod_{i=1}^{T} n_i!} \prod_{i=1}^{T} x_i^{n_i}. \qquad (11)$$

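In order to understand the implications of this choice, the resulting posterior distribution on remaining mines must be examined. Equation 3 is applied to the unknown $\mathbf{n}$ for observed data $\mathbf{m}$. Combining equations 8, 10 and 11 the posterior distribution is:

$$
\begin{aligned}
P(\mathbf{n} \mid \mathbf{m}, \mathbf{p}) &\propto P(\mathbf{m} \mid \mathbf{n})\,P(\mathbf{n}) && (12)\\
&\propto \prod_{i=1}^{T} \left[ \binom{n_i}{m_i} p_i^{m_i} (1 - p_i)^{n_i - m_i} \right] \frac{n!}{\prod_{i=1}^{T} n_i!} \prod_{i=1}^{T} x_i^{n_i} && (13)\\
&\propto n! \prod_{i=1}^{T} \frac{(1 - p_i)^{n_i - m_i}\, x_i^{n_i}}{(n_i - m_i)!} && (14)
\end{aligned}
$$

by cancelling and removing all terms which are constant with respect to $\mathbf{n}$, i.e. depend only on $\mathbf{m}$ and $\mathbf{p}$. In terms of remaining targets $\mathbf{r}$ for $r_i = n_i - m_i$ we have:

$$P(\mathbf{r} \mid \mathbf{m}, \mathbf{p}) \propto n! \prod_{i=1}^{T} \left[ \frac{\big((1 - p_i)\, x_i\big)^{r_i}}{r_i!} \right], \qquad (15)$$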

since $x_i^{m_i}$ is also constant. What is interesting about the posterior distribution resulting from the multinomial prior on $\mathbf{n} \mid n$ is that the observed $m_i$'s have completely disappeared in all but the $n! = \left(\sum_{i=1}^{T} (m_i + r_i)\right)!$ term. This happened because the term $\prod_{i=1}^{T} (m_i + r_i)!$ was cancelled from the numerator of the likelihood function and the denominator of the prior distribution. This cancellation of terms left the posterior distribution dependent on the observed $m_i$'s only by their sum $M = \sum_{i=1}^{T} m_i$. This means that there is no distinction between the case where we flip two coins and observe five heads from coin one and zero from coin two, and the case where we observe zero from coin one and five from coin two, as well as the case of two from one and three from the other. In this case, the probabilities $x_i$ are the only term influencing the estimation of remaining targets by type. As the $x_i$ are unknown and only an estimate, we would like to find an approach which allows the data to influence the final answers. That is, we would like the prediction of coin flips from coin one and coin two to depend on the number of observed heads from coin one and coin two.
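The loss of information under the multinomial prior can be seen numerically. The sketch below is an illustration under stated assumptions (two sensors, a truncated grid with r_max = 40, and arbitrarily chosen p and x): it multiplies the Binomial likelihood by the multinomial prior, normalizes over the grid of remaining targets, and compares the posteriors for three different splits of the same total M = 5.

```python
from math import comb, factorial, prod

def posterior_multinomial_prior(m, p, x, r_max):
    """Posterior over (r_1, r_2) on a truncated grid, multinomial prior on n | n."""
    weights = {}
    for r1 in range(r_max + 1):
        for r2 in range(r_max + 1):
            n = (m[0] + r1, m[1] + r2)
            # Binomial likelihood of the detections in each sub-area.
            like = prod(comb(ni, mi) * pi**mi * (1 - pi)**(ni - mi)
                        for ni, mi, pi in zip(n, m, p))
            # Multinomial prior on the split of the total among sub-areas.
            coef = factorial(sum(n)) // prod(factorial(ni) for ni in n)
            prior = coef * prod(xi**ni for xi, ni in zip(x, n))
            weights[(r1, r2)] = like * prior
    z = sum(weights.values())
    return {r: w / z for r, w in weights.items()}

# Assumed illustration values: same total of five detections, split three ways.
p, x, r_max = (0.25, 0.75), (0.5, 0.5), 40
post_a = posterior_multinomial_prior((5, 0), p, x, r_max)
post_b = posterior_multinomial_prior((0, 5), p, x, r_max)
post_c = posterior_multinomial_prior((2, 3), p, x, r_max)

# Largest pointwise difference between the three posteriors.
max_diff = max(max(abs(post_a[r] - post_b[r]), abs(post_a[r] - post_c[r]))
               for r in post_a)
print(max_diff)
```

The three posteriors agree to floating-point precision, so with this prior the allocation of remaining targets between sub-areas is driven entirely by the assumed probabilities and the detection probabilities, not by where the detections occurred.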

In order to do this, a mixture model is proposed. The concept of a mixture model is first introduced using the beta-binomial mixture distribution for a two sensor case in 2.4.2 and this is then developed for the multiple sensor case using the Dirichlet-Multinomial mixture model in 2.4.3.

2.4.2  The  beta-binomial  mixture  distribution

A random variable $Y$ has a mixture distribution if the distribution of $Y$ depends on a quantity which also has a distribution [3, 6]. Mixture distributions are also referred to as hierarchical models. An example of a mixture distribution is the Beta-Binomial distribution where

$$Y \mid P \sim \text{Binomial}(n, P)$$

$$P \sim \text{Beta}(\alpha, \beta)$$

and the Beta distribution is defined by:

$$f(x \mid \alpha, \beta) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\, x^{\alpha - 1} (1 - x)^{\beta - 1} \qquad (16)$$

for $\alpha > 0$, $\beta > 0$, $0 < x < 1$.

The marginal distribution of $Y$ over the joint distribution of $(Y, P)$ can then be computed. Here, the binomial distribution is being used to allocate the targets to the sub-areas of sensor one and sensor two. Total targets are represented by $n$, $y$ is the number of targets in the sub-area of sensor one, and $p$ is the probability that a target is in sub-area one. As $p$ is unknown, we consider it to be a random variable $P$ and give $P$ a Beta($\alpha$, $\beta$) distribution for some chosen $\alpha$, $\beta$, i.e. the $f(x \mid \alpha, \beta)$ from equation 16 is substituted for $f(p)$ in equation 19. The marginal distribution of $Y$ is then:

$$
\begin{aligned}
P(Y = y) &= P(Y = y,\ 0 < P < 1) && (17)\\
&= \int_0^1 f(y, p)\, dp && (18)\\
&= \int_0^1 f(y \mid p)\, f(p)\, dp && (19)\\
&= \int_0^1 \left[ \binom{n}{y} p^{y} (1 - p)^{n - y} \right] \left[ \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\, p^{\alpha - 1} (1 - p)^{\beta - 1} \right] dp && (20)\\
&= \binom{n}{y} \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)} \int_0^1 p^{(y + \alpha - 1)} (1 - p)^{(n - y + \beta - 1)}\, dp && (21)\\
&= \binom{n}{y} \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)} \cdot \frac{\Gamma(y + \alpha)\,\Gamma(n - y + \beta)}{\Gamma(n + \alpha + \beta)}, && (22)
\end{aligned}
$$

since the integrand is the kernel of the Beta($y + \alpha$, $n - y + \beta$) pdf and therefore the integral must integrate to the reciprocal of the normalizing constant.

The Beta-Binomial mixture was shown using parameters more familiar to the Binomial, $n$, $y$ and $p$, and Beta, $\alpha$ and $\beta$, distributions. Putting this back into the notation of multiple sensors with an index $i$, for the prior distribution on $P(n_1, n_2 \mid n)$ we substitute $n_1 = y$ and $n_2 = n - y$ in the beta-binomial mixture above, i.e. $n_1, n_2 \mid n \sim \text{Beta}(n_1 + \alpha, n_2 + \beta)$. Although the two parameters $\alpha$ and $\beta$ must still be chosen (as the $x_i$ in 2.4.1), they provide much more flexibility than the $x_i$. For example, $\alpha = \beta = 1$ simplifies to $\frac{1}{n+1}$. This means that for a fixed $n$ the prior distribution is uniform on the sample space as for each $n$ we have $n + 1$ combinations and each combination is given equal weight. For $\alpha = \beta = 1$, there is no assumption about the relative densities for the two sensors. The data determine the posterior distribution. If there is good prior information to determine the relative densities for the sub-areas, $\alpha$ and $\beta$ can be chosen so that they will have more of an impact on the final answers. Large $\alpha$, $\beta$ will have a larger impact on the final answers and smaller values will have less influence on the final answers.

2.4.3  A  Dirichlet-Multinomial  Mixture

The same idea can be applied to a Dirichlet-Multinomial mixture for $T$ sensors. We start with the Multinomial distribution described in 2.4.1. Then, we consider the $x_i$'s not as fixed parameters, but as random variables. The Dirichlet($\alpha_1, \dots, \alpha_T$) distribution for $\mathbf{X} = (X_1, \dots, X_T)$ is defined as:

$$f(\mathbf{x} \mid \boldsymbol{\alpha}) = \frac{\Gamma\!\left(\sum_{i=1}^{T} \alpha_i\right)}{\prod_{i=1}^{T} \Gamma(\alpha_i)} \prod_{i=1}^{T} x_i^{\alpha_i - 1} \qquad (23)$$

for $\alpha_i > 0$, $0 < x_i < 1$ and $\sum_{i=1}^{T} x_i = 1$ (see [5, 7]).

Using this in the same way as the Beta Distribution was used with the Binomial Distribution, the Dirichlet-Multinomial hierarchical model is of the form:

$$N_1, N_2, \dots, N_T \mid X_1, X_2, \dots, X_T \sim \text{Multinomial}(n_1, \dots, n_T, x_1, \dots, x_T)$$

$$X_1, X_2, \dots, X_T \sim \text{Dirichlet}(\alpha_1, \dots, \alpha_T).$$

As in 2.4.2, we compute the marginal distribution of $(N_1, \dots, N_T)$ by conditioning on $(X_1, \dots, X_T)$.

$$
\begin{aligned}
P(\mathbf{n} \mid n) &= \int_{x_1 = 0}^{1} \cdots \int_{x_{T-1} = 0}^{1 - \sum_{i=1}^{T-2} x_i} f(\mathbf{n}, \mathbf{x})\, dx_1 \cdots dx_{T-1} && (24)\\
&= \int_{x_1 = 0}^{1} \cdots \int_{x_{T-1} = 0}^{1 - \sum_{i=1}^{T-2} x_i} f(\mathbf{n} \mid \mathbf{x})\, f(\mathbf{x})\, dx_1 \cdots dx_{T-1} && (25)\\
&= \int_{x_1 = 0}^{1} \cdots \int_{x_{T-1} = 0}^{1 - \sum_{i=1}^{T-2} x_i} \frac{n!}{n_1! \cdots n_T!} \prod_{i=1}^{T} x_i^{n_i} \cdot \frac{\Gamma\!\left(\sum_{i=1}^{T} \alpha_i\right)}{\prod_{i=1}^{T} \Gamma(\alpha_i)} \prod_{i=1}^{T} x_i^{\alpha_i - 1}\, dx_1 \cdots dx_{T-1} && (26)\\
&= \frac{n!}{n_1! \cdots n_T!} \frac{\Gamma\!\left(\sum_{i=1}^{T} \alpha_i\right)}{\prod_{i=1}^{T} \Gamma(\alpha_i)} \int_{x_1 = 0}^{1} \cdots \int_{x_{T-1} = 0}^{1 - \sum_{i=1}^{T-2} x_i} \prod_{i=1}^{T} x_i^{n_i + \alpha_i - 1}\, dx_1 \cdots dx_{T-1} && (27)\\
&= \frac{n!}{n_1! \cdots n_T!} \frac{\Gamma\!\left(\sum_{i=1}^{T} \alpha_i\right)}{\prod_{i=1}^{T} \Gamma(\alpha_i)} \cdot \frac{\prod_{i=1}^{T} \Gamma(n_i + \alpha_i)}{\Gamma\!\left(\sum_{i=1}^{T} (n_i + \alpha_i)\right)}, && (28)
\end{aligned}
$$

again using the fact that $\prod_{i=1}^{T} x_i^{n_i + \alpha_i - 1}$ is the kernel of a Dirichlet($\alpha_1 + n_1, \dots, \alpha_T + n_T$) pdf, and therefore the integral must be equal to the reciprocal of the normalization constant. Note that $n = \sum_{i=1}^{T} n_i$ has also been used to simplify the notation.




NURC  Reprint  Series 


NURC-PR-2006-005 


As were $\alpha$ and $\beta$ in the Beta-Binomial mixture, the $\alpha_i$ are fixed parameters. Choosing all $\alpha_i = 1$ has the same effect of giving uniform weight to all the combinations of the $n_i$ for a fixed $n$. Large $\alpha_i$ will have more impact on the posterior distribution than small values of $\alpha_i$.
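The uniform-weight property for all parameters equal to one can be checked directly from the closed form of the Dirichlet-Multinomial mass function. The following is a minimal sketch; `dirichlet_multinomial_pmf` is a name introduced here for illustration, and log-gamma is used only to avoid overflow in the Gamma functions.

```python
from math import comb, exp, lgamma

def dirichlet_multinomial_pmf(counts, alpha):
    """Dirichlet-Multinomial mass P(n_1, ..., n_T | n), evaluated in log space."""
    n, a = sum(counts), sum(alpha)
    # Multinomial coefficient n! / (n_1! ... n_T!).
    log_p = lgamma(n + 1) - sum(lgamma(c + 1) for c in counts)
    # Dirichlet normalizing constant Gamma(sum alpha) / prod Gamma(alpha_i).
    log_p += lgamma(a) - sum(lgamma(ai) for ai in alpha)
    # Reciprocal normalizing constant of the updated Dirichlet.
    log_p += sum(lgamma(c + ai) for c, ai in zip(counts, alpha)) - lgamma(n + a)
    return exp(log_p)

# Two sensors (Beta-Binomial case): each of the n + 1 = 5 splits gets weight 1/5.
uniform_two = [dirichlet_multinomial_pmf((y, 4 - y), (1.0, 1.0)) for y in range(5)]
print(uniform_two)

# Three sensors: every split of n = 3 gets weight 1 / C(n + T - 1, T - 1).
uniform_three = dirichlet_multinomial_pmf((2, 1, 0), (1.0, 1.0, 1.0))
print(uniform_three, 1 / comb(3 + 3 - 1, 3 - 1))
```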

2.5  The  Posterior  Distribution 

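Combining equations 8 and 28 using 3, the posterior distribution on $n_1, \dots, n_T$ given the $m_i$, $p_i$ is then defined by:

$$
\begin{aligned}
P(\mathbf{n} \mid \mathbf{m}, \mathbf{p}) &\propto P(\mathbf{m} \mid \mathbf{n})\,P(\mathbf{n}) && (29)\\
&\propto \prod_{i=1}^{T} \left[ \binom{n_i}{m_i} p_i^{m_i} (1 - p_i)^{n_i - m_i} \right] \frac{n!}{\prod_{i=1}^{T} n_i!} \cdot \frac{\Gamma\!\left(\sum_{i=1}^{T} \alpha_i\right)}{\prod_{i=1}^{T} \Gamma(\alpha_i)} \cdot \frac{\prod_{i=1}^{T} \Gamma(n_i + \alpha_i)}{\Gamma\!\left(n + \sum_{i=1}^{T} \alpha_i\right)}. && (30)
\end{aligned}
$$

As with the single sensor case we consider the posterior distribution on the targets remaining $r_i = n_i - m_i$ for $i = 1, \dots, T$:

$$
\begin{aligned}
P(\mathbf{r} \mid \mathbf{m}, \mathbf{p}) \propto{} & \prod_{i=1}^{T} \left[ \frac{\Gamma(r_i + m_i + 1)}{\Gamma(m_i + 1)\,\Gamma(r_i + 1)}\, p_i^{m_i} (1 - p_i)^{r_i} \right] \cdot \frac{\Gamma\!\left(\left(\sum_{i=1}^{T} r_i + m_i\right) + 1\right)}{\prod_{i=1}^{T} \Gamma(r_i + m_i + 1)} \\
& \cdot \frac{\Gamma\!\left(\sum_{i=1}^{T} \alpha_i\right)}{\prod_{i=1}^{T} \Gamma(\alpha_i)} \cdot \frac{\prod_{i=1}^{T} \Gamma(m_i + r_i + \alpha_i)}{\Gamma\!\left(\sum_{i=1}^{T} r_i + m_i + \sum_{i=1}^{T} \alpha_i\right)} && (31)
\end{aligned}
$$

using the fact that $x! = \Gamma(x + 1)$.

Cancelling terms and removing constant terms with respect to $\mathbf{r}$,

$$P(\mathbf{r} \mid \mathbf{m}, \mathbf{p}) \propto \frac{\Gamma\!\left(1 + M + \sum_{i=1}^{T} r_i\right)}{\Gamma\!\left(A + M + \sum_{i=1}^{T} r_i\right)} \prod_{i=1}^{T} \left[ q_i^{r_i}\, \frac{\Gamma(m_i + r_i + \alpha_i)}{\Gamma(r_i + 1)} \right], \qquad (32)$$

where $A = \sum_{i=1}^{T} \alpha_i$, $M = \sum_{i=1}^{T} m_i$, and $q_i = 1 - p_i$. As the individual $m_i$'s are still part of the resulting posterior, rather than just the $M$ as in 15, the distribution on targets remaining in the overall area and sub-areas will depend on the observed $m_i$. This means that five heads resulting from coin one and zero from coin two will give a different prediction on total flips from coin one than would zero heads from coin one and five from coin two, even if the overall distribution of total flips would be the same for fair coins.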
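The dependence of distribution 32 on the individual detections can be illustrated for two fair coins. The sketch below is an illustration, not the authors' code: it evaluates the kernel of distribution 32 in log space on a truncated grid (r_max = 80 is an assumed truncation), with all Dirichlet parameters set to one, and compares the posterior means for detections (5, 0) and (0, 5).

```python
from math import exp, lgamma, log

def log_kernel(r, m, alpha, p):
    """Log of the unnormalized posterior (32) at remaining-target vector r."""
    A, M, s = sum(alpha), sum(m), sum(r)
    value = lgamma(1 + M + s) - lgamma(A + M + s)
    for ri, mi, ai, pi in zip(r, m, alpha, p):
        # q_i^{r_i} * Gamma(m_i + r_i + alpha_i) / Gamma(r_i + 1)
        value += ri * log(1 - pi) + lgamma(mi + ri + ai) - lgamma(ri + 1)
    return value

def posterior_two_sensors(m, alpha, p, r_max):
    """Normalize the kernel over a truncated grid of (r_1, r_2)."""
    weights = {(r1, r2): exp(log_kernel((r1, r2), m, alpha, p))
               for r1 in range(r_max + 1) for r2 in range(r_max + 1)}
    z = sum(weights.values())
    return {r: w / z for r, w in weights.items()}

def mean_remaining(post):
    """Posterior means (E[r_1], E[r_2])."""
    e1 = sum(r[0] * w for r, w in post.items())
    e2 = sum(r[1] * w for r, w in post.items())
    return e1, e2

alpha, p, r_max = (1.0, 1.0), (0.5, 0.5), 80
e_a = mean_remaining(posterior_two_sensors((5, 0), alpha, p, r_max))
e_b = mean_remaining(posterior_two_sensors((0, 5), alpha, p, r_max))
print(e_a, sum(e_a))
print(e_b, sum(e_b))
```

In this sketch the expected total number of remaining flips is the same for both observations, while the expected split between the two coins is reversed, which is the behaviour the hierarchical prior was introduced to obtain.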


3  Expectation  Calculations 

The goal of the derivation above is to calculate various measures of effectiveness which can be represented as expected values with respect to distribution 32. Of particular interest are the two measures of effectiveness:

•  The expected number of remaining targets, and

•  The threat, or probability of damage, to traffic transiting the area.

The first quantity can be expressed, using the linearity of the expectation operator, by

$$E\!\left(\sum_{i=1}^{T} r_i\right) = \sum_{i=1}^{T} E(r_i). \qquad (33)$$

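For the second quantity, define $a$ to be the probability that a transiting vessel will not be damaged by a single target. Then the probability of not being damaged by $r$ targets is $a^r$. Thus, in a single area the probability of damage is $1 - a^r$. Assuming a transiting vessel must pass each of $T$ areas safely given probability of safe transit $a_i$ for a single target in each sub-area and $r_i$ targets in each sub-area, the probability of safe transit given $\mathbf{r} = (r_1, \dots, r_T)$ remaining targets is $\prod_{i=1}^{T} a_i^{r_i}$. Defining $D$ to be the event that a ship transiting the minefield is damaged by any target,

$$P(D \mid \mathbf{R} = \mathbf{r}) = 1 - \prod_{i=1}^{T} a_i^{r_i}. \qquad (34)$$

The expected value of this probability over $\mathbf{R} = (R_1, \dots, R_T)$ is then:

$$
\begin{aligned}
P(D) &= \sum_{r_1 = 0}^{\infty} \cdots \sum_{r_T = 0}^{\infty} P(D \mid \mathbf{R} = \mathbf{r})\,P(\mathbf{R} = \mathbf{r}) && (35)\\
&= \sum_{r_1 = 0}^{\infty} \cdots \sum_{r_T = 0}^{\infty} P(D \mid \mathbf{r})\,P(\mathbf{r} \mid \mathbf{m}, \mathbf{p}) && (36)\\
&= E\big(P(D \mid \mathbf{R} = \mathbf{r})\big) && (37)\\
&= E\!\left(1 - \prod_{i=1}^{T} a_i^{R_i}\right) && (38)\\
&= 1 - E\!\left(\prod_{i=1}^{T} a_i^{R_i}\right). && (39)
\end{aligned}
$$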

In  both  cases,  the  desired  measure  of  effectiveness 
can  be  represented  in  terms  of  the  expected  values, 
with  respect  to  distribution  32,  of  separable  functions. 
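For a single area both measures can be evaluated without the machinery of Section 4, which gives a useful sanity check. The sketch below is an illustration under stated assumptions: it uses the single-sensor NegativeBinomial(m + 1, p) posterior of Section 2.1 rather than distribution 32, and the closed form relies on the standard probability generating function of the negative binomial, which is not derived in this paper. The values of m, p and a are arbitrary.

```python
from math import comb

def safe_transit_by_sum(m, p, a, r_max=400):
    """E[a^R] for R ~ NegativeBinomial(m + 1, p), by truncated summation."""
    q = 1 - p
    return sum(comb(m + r, r) * p**(m + 1) * (q * a)**r
               for r in range(r_max + 1))

def safe_transit_closed_form(m, p, a):
    """E[a^R] from the negative binomial generating function (standard result)."""
    return (p / (1 - (1 - p) * a))**(m + 1)

# Assumed example: ten detections, detection probability 0.8,
# and a 0.9 chance of passing any single remaining target unharmed.
m, p, a = 10, 0.8, 0.9
threat_sum = 1 - safe_transit_by_sum(m, p, a)
threat_closed = 1 - safe_transit_closed_form(m, p, a)
expected_remaining = (m + 1) * (1 - p) / p  # mean of NegativeBinomial(m + 1, p)
print(threat_sum, threat_closed, expected_remaining)
```

The function whose expectation is taken, a raised to the number of remaining targets, is the single-area instance of the separable product appearing in equation 39.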

4  Implementation  of  Expectation  Calculations

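In order to simplify the presentation, define

$$K(\mathbf{r}) = \frac{\Gamma\!\left(1 + M + \sum_{i=1}^{T} r_i\right)}{\Gamma\!\left(A + M + \sum_{i=1}^{T} r_i\right)}$$

and

$$h_i(r) = q_i^{r}\, \frac{\Gamma(r + m_i + \alpha_i)}{\Gamma(r + 1)}.$$

Assume, here and below, that the function $g : \mathbb{N}^T \to \mathbb{R}$ is separable and can be written in the form

$$g(\mathbf{r}) = g_1(r_1)\, g_2(r_2) \cdots g_T(r_T). \qquad (40)$$

Finally, define

$$
\begin{aligned}
S_N(g) &= \sum_{r_1 = 0}^{N} \cdots \sum_{r_T = 0}^{N} K(\mathbf{r}) \prod_{i=1}^{T} h_i(r_i)\, g_i(r_i) && (41)\\
&= \sum_{r_1 = 0}^{N} h_1(r_1)\, g_1(r_1) \cdots \sum_{r_T = 0}^{N} h_T(r_T)\, g_T(r_T) \cdot K(\mathbf{r}) && (42)
\end{aligned}
$$

and

$$S(g) = \lim_{N \to \infty} S_N(g). \qquad (43)$$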

By direct comparison of equations 32 and 41 we see that the normalization constant for distribution 32 is $S(1)$, where $1$ denotes the constant function $g \equiv 1$, and that the expected value of a separable function $g$ with respect to distribution 32 is given by $E(g) = S(g)/S(1)$.

Since a closed form expression for $S(g)$ is unlikely to exist, we assume that $N$ has been chosen large enough so that $S_N(g)$ is sufficiently close to $S(g)$ and discuss the calculation of $S_N(g)$. From 41 we see that the obvious method of computing $S_N(g)$ has complexity $O(N^T)$, which severely limits the applicability of distribution 32. However, this limitation can be overcome.

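The difficulty in calculating $S_N(g)$ is caused by inseparability of the kernel function $K$. However, $K$ still has a particularly nice structure. Specifically, if we define

$$k(x) = \frac{\Gamma(1 + M + x)}{\Gamma(A + M + x)} \qquad (44)$$

with $x \in \{0, 1, \dots, NT\}$, then

$$K(\mathbf{r}) = k\!\left(\sum_{i=1}^{T} r_i\right). \qquad (45)$$

Representing $k$ by its discrete Fourier expansion, we obtain

$$k(x) = \sum_{j=0}^{NT} c_j\, b_j^{x}, \qquad (46)$$

where

$$b_j = \exp\!\left(\frac{2\pi j\, i}{NT + 1}\right), \qquad (47)$$

$$c_j = \frac{1}{NT + 1} \sum_{n=0}^{NT} k(n)\, b_j^{-n}. \qquad (48)$$

Using this expansion to represent $K$, we have

$$
\begin{aligned}
K(\mathbf{r}) &= k(r_1 + r_2 + \cdots + r_T) && (49)\\
&= \sum_{j=0}^{NT} c_j\, b_j^{(r_1 + r_2 + \cdots + r_T)} && (50)\\
&= \sum_{j=0}^{NT} c_j \left( \prod_{k=1}^{T} b_j^{r_k} \right). && (51)
\end{aligned}
$$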

Thus, while $K$ is inseparable, it is "nearly separable" in the sense that it can be represented as a finite sum of separable functions.

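Inserting 51 into equation 42 we get, after some simplification,

$$
\begin{aligned}
S_N(g) &= \sum_{j=0}^{NT} c_j \sum_{r_1 = 0}^{N} b_j^{r_1} h_1(r_1)\, g_1(r_1) \cdots \sum_{r_T = 0}^{N} b_j^{r_T} h_T(r_T)\, g_T(r_T) && (52)\\
&= \sum_{j=0}^{NT} c_j \left( \prod_{k=1}^{T} \left( \sum_{r_k = 0}^{N} b_j^{r_k}\, g_k(r_k)\, h_k(r_k) \right) \right).
\end{aligned}
$$

The computational complexity of calculating $S_N(g)$ using expression 52 is $O(N^2 T^2)$, since each of the $NT + 1$ values of $j$ requires $T$ one-dimensional sums of $N + 1$ terms. This provides remarkable speed-up in computation time and greatly increases the applicability of distribution 32.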
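The two ways of computing the truncated sum can be compared on a small case. The following is a minimal sketch under stated assumptions, not the authors' implementation: `s_brute` evaluates the nested sum directly, `s_fourier` uses the finite Fourier expansion of the kernel, and the example values of m, alpha, q and the truncation N are chosen only for illustration. The coefficients are computed by a direct discrete Fourier transform (an FFT could be substituted), and the usual 1/(NT + 1) normalization of the inverse transform is assumed.

```python
import cmath
from itertools import product
from math import exp, lgamma

def k_fun(x, A, M):
    """Kernel k(x) = Gamma(1 + M + x) / Gamma(A + M + x)."""
    return exp(lgamma(1 + M + x) - lgamma(A + M + x))

def h_fun(r, m_i, alpha_i, q_i):
    """h_i(r) = q_i^r * Gamma(r + m_i + alpha_i) / Gamma(r + 1)."""
    return q_i**r * exp(lgamma(r + m_i + alpha_i) - lgamma(r + 1))

def s_brute(g, m, alpha, q, N):
    """Direct evaluation of the T-fold sum, O(N^T) terms."""
    A, M = sum(alpha), sum(m)
    total = 0.0
    for r in product(range(N + 1), repeat=len(m)):
        term = k_fun(sum(r), A, M)
        for i, r_i in enumerate(r):
            term *= h_fun(r_i, m[i], alpha[i], q[i]) * g[i](r_i)
        total += term
    return total

def s_fourier(g, m, alpha, q, N):
    """Evaluation through the finite Fourier expansion of k, O(N^2 T^2)."""
    T, A, M = len(m), sum(alpha), sum(m)
    L = N * T + 1  # number of Fourier terms, x ranges over 0..NT
    k_vals = [k_fun(x, A, M) for x in range(L)]
    # Tabulate f_i(r) = h_i(r) g_i(r) once for each sensor.
    f = [[h_fun(r, m[i], alpha[i], q[i]) * g[i](r) for r in range(N + 1)]
         for i in range(T)]
    total = 0j
    for j in range(L):
        b = cmath.exp(2j * cmath.pi * j / L)
        c = sum(k_vals[n] * b**(-n) for n in range(L)) / L
        term = c
        for f_i in f:
            # One-dimensional sum over r_k for this Fourier term.
            term *= sum(b**r * f_i[r] for r in range(N + 1))
        total += term
    return total.real

one = lambda r: 1.0
m, alpha, q, N = (2, 0, 1), (1.0, 1.0, 1.0), (0.5, 0.3, 0.2), 8
ones = [one, one, one]
norm_brute = s_brute(ones, m, alpha, q, N)
norm_fast = s_fourier(ones, m, alpha, q, N)

# Expected remaining targets in sub-area one: g_1(r) = r, other factors 1.
g_first = [lambda r: float(r), one, one]
expected_r1 = s_fourier(g_first, m, alpha, q, N) / norm_fast
print(norm_brute, norm_fast, expected_r1)
```

By linearity, the expected total number of remaining targets is obtained by repeating the last step for each sub-area and adding the results; the threat is obtained by taking each factor to be the safe-transit probability raised to the power r and subtracting the resulting expectation from one.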


5  Summary 

This paper presents a method of estimating the number of targets remaining in an area, and the threat to transiting traffic, after an attempt has been made to detect an overall, unknown number of targets. The method is based on expectation calculations with respect to a distribution which was derived using a Dirichlet-Multinomial prior distribution and assumptions about the effectiveness of the original search attempt. The resulting calculations, in their canonical form, are computationally prohibitive. An alternative form of the calculations is presented which provides a mechanism for calculating the expectations in a negligible amount of time.